565 research outputs found

    Noise-robust detection of peak-clipping in decoded speech

    Acoustic simultaneous localization and mapping (A-SLAM) of a moving microphone array and its surrounding speakers

    Acoustic scene mapping creates a representation of positions of audio sources such as talkers within the surrounding environment of a microphone array. By allowing the array to move, the acoustic scene can be explored in order to improve the map. Furthermore, the spatial diversity of the kinematic array allows for estimation of the source-sensor distance in scenarios where source directions of arrival are measured. As sound source localization is performed relative to the array position, mapping of acoustic sources requires knowledge of the absolute position of the microphone array in the room. If the array is moving, its absolute position is unknown in practice. Hence, Simultaneous Localization and Mapping (SLAM) is required in order to localize the microphone array position and map the surrounding sound sources. In realistic environments, microphone arrays receive a convolutive mixture of direct-path speech signals, noise and reflections due to reverberation. A key challenge of Acoustic SLAM (a-SLAM) is robustness against reverberant clutter measurements and missing source detections. This paper proposes a novel bearing-only a-SLAM approach using a Single-Cluster Probability Hypothesis Density filter. Results demonstrate convergence to accurate estimates of the array trajectory and source positions.
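The abstract's remark that a moving array's spatial diversity allows range estimation from direction-of-arrival measurements can be illustrated with plain two-view triangulation. This is only a minimal 2D sketch under idealized assumptions (noise-free bearings, known array positions, bearings already expressed in world coordinates); it is not the Single-Cluster PHD filter the paper proposes, and the function name `triangulate` is hypothetical.

```python
import math

def triangulate(p1, bearing1, p2, bearing2, eps=1e-9):
    """Intersect two bearing lines observed from array positions p1 and p2.

    Bearings are angles in radians in the world frame (an assumption).
    Returns the (x, y) intersection, or None if the lines are near-parallel.
    The sketch does not check whether the intersection lies in front of
    each observer.
    """
    d1 = (math.cos(bearing1), math.sin(bearing1))
    d2 = (math.cos(bearing2), math.sin(bearing2))
    # 2D cross product of the two direction vectors.
    denom = d1[0] * d2[1] - d1[1] * d2[0]
    if abs(denom) < eps:
        return None  # no unique intersection, range is unobservable
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    # Distance along the first bearing line to the intersection point.
    t1 = (dx * d2[1] - dy * d2[0]) / denom
    return (p1[0] + t1 * d1[0], p1[1] + t1 * d1[1])
```

With a single array position only the bearing is known; the second, displaced observation is what makes the source-sensor distance recoverable in this idealized setting.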

    Spherical microphone array acoustic rake receivers

    Several signal-independent acoustic rake receivers are proposed for speech dereverberation using spherical microphone arrays. The proposed rake designs take advantage of multipaths by separately capturing and combining early reflections with the direct path. We investigate several approaches to combining reflections with the direct-path source signal, including the development of beam patterns that point nulls at all preceding reflections. The proposed designs are tested in experimental simulations and their dereverberation performance is evaluated using objective measures. For the tested configuration, the proposed designs achieve higher levels of dereverberation than conventional signal-independent beamforming systems, achieving up to 3.6 dB improvement in the direct-to-reverberant ratio over the plane-wave decomposition beamformer.

    Estimation of glottal closure instants in voiced speech using the DYPSA algorithm

    Bearing-only acoustic tracking of moving speakers for robot audition

    This paper focuses on speaker tracking in robot audition for human-robot interaction. Using only acoustic signals, speaker tracking in enclosed spaces is subject to missing detections and spurious clutter measurements due to speech inactivity, reverberation and interference. Furthermore, many acoustic localization approaches estimate speaker direction, hence providing bearing-only measurements without range information. This paper presents a probability hypothesis density (PHD) tracker that augments the bearing-only speaker directions of arrival with a cloud of range hypotheses at speaker initiation and propagates the random variates through time. Furthermore, due to their formulation, PHD filters explicitly model, and hence provide robustness against, clutter and missing detections. The approach is verified using experimental results.
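The idea of augmenting a bearing-only measurement with a cloud of range hypotheses can be sketched as follows. This is a minimal 2D illustration of the initiation step only, assuming ranges drawn uniformly between hypothetical bounds `r_min` and `r_max`; the sampling distribution, the function name and all parameters are assumptions, and the PHD prediction and update steps described in the abstract are not shown.

```python
import math
import random

def init_range_hypotheses(sensor_xy, bearing, r_min, r_max, n, rng=None):
    """Place n candidate source positions along one measured bearing.

    Each candidate shares the measured bearing (radians) but carries a
    different range, drawn uniformly from [r_min, r_max] (an assumption).
    """
    rng = rng or random.Random()
    sx, sy = sensor_xy
    cloud = []
    for _ in range(n):
        r = rng.uniform(r_min, r_max)  # one range hypothesis
        cloud.append((sx + r * math.cos(bearing), sy + r * math.sin(bearing)))
    return cloud
```

A tracker would then propagate these candidates through time and reweight them as further bearings arrive, so that hypotheses at implausible ranges lose support.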

    Acoustic SLAM

    An algorithm is presented that enables devices equipped with microphones, such as robots, to move within their environment in order to explore, adapt to and interact with sound sources of interest. Acoustic scene mapping creates a 3D representation of the positional information of sound sources across time and space. In practice, positional source information is only provided by Direction-of-Arrival (DoA) estimates of the source directions; the source-sensor range is typically difficult to obtain. DoA estimates are also adversely affected by reverberation, noise, and interference, leading to errors in source location estimation and consequent false DoA estimates. Moreover, many acoustic sources, such as human talkers, are not continuously active, such that periods of inactivity lead to missing DoA estimates. In addition, the DoA estimates are specified relative to the observer's sensor location and orientation. Accurate positional information about the observer is therefore crucial. This paper proposes Acoustic Simultaneous Localization and Mapping (aSLAM), which uses acoustic signals to simultaneously map the 3D positions of multiple sound sources whilst passively localizing the observer within the scene map. The performance of aSLAM is analyzed and evaluated using a series of realistic simulations. Results are presented to show the impact of the observer motion and sound source localization accuracy.
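Because DoA estimates are expressed relative to the observer's orientation, they must be rotated into a common frame before sources can be mapped. The sketch below shows that conversion for a simplified case, assuming the observer's orientation is a pure yaw about the vertical axis and that azimuth and elevation are given in radians; the function name and angle conventions are assumptions, not the paper's notation.

```python
import math

def doa_to_world(azimuth, elevation, observer_yaw):
    """Rotate an observer-frame DoA into a world-frame unit vector.

    Assumes the observer orientation is a yaw rotation only (no pitch or
    roll), with azimuth measured counter-clockwise from the observer's
    x-axis and elevation measured up from the horizontal plane.
    """
    # Unit direction vector in the observer's own frame.
    ux = math.cos(elevation) * math.cos(azimuth)
    uy = math.cos(elevation) * math.sin(azimuth)
    uz = math.sin(elevation)
    # Rotate about the vertical axis by the observer's yaw.
    c, s = math.cos(observer_yaw), math.sin(observer_yaw)
    return (c * ux - s * uy, s * ux + c * uy, uz)
```

An error in the assumed yaw rotates every mapped direction by the same amount, which is one way to see why the abstract calls accurate observer information crucial.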

    Room geometry estimation from a single channel acoustic impulse response

    For a 2D rectangular room of unknown dimensions and with unknown source and microphone positions, the times of arrival of reflections can be described in terms of image source positions. Adopting a microphone-centred coordinate system, it is shown that satisfying certain combinations of arrival times imposes constraints on the possible room geometry: a second-order reflection from adjacent walls determines the source-microphone distance; a second-order reflection from opposite walls in a given dimension determines the source displacement in that dimension as a function of the source-receiver distance. Given a subset of time differences of arrival, the extent to which the geometry can be determined is related to these constraints. The geometry estimation is further posed as a least squares optimisation problem whose results verify the analytical results. Index Terms: geometry estimation, acoustic impulse response, time of arrival, TOA, room identification
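The image-source description of arrival times can be made concrete with a small sketch. It places the microphone at the origin, mirrors the source in one wall at x = `x_wall` and one adjacent wall at y = `y_wall`, and then checks one identity consistent with the abstract's first constraint: the squared first-order path lengths minus the squared second-order path length equals the squared source-microphone distance. The function names, the speed of sound value and the use of both first-order arrivals are assumptions for illustration, not necessarily the paper's derivation.

```python
import math

C = 343.0  # assumed speed of sound in m/s

def image_toas(src, x_wall, y_wall, c=C):
    """Times of arrival at a microphone at the origin (2D image-source model)."""
    sx, sy = src
    images = {
        "direct": (sx, sy),
        "first_x": (2 * x_wall - sx, sy),               # mirrored in wall x = x_wall
        "first_y": (sx, 2 * y_wall - sy),               # mirrored in wall y = y_wall
        "second_xy": (2 * x_wall - sx, 2 * y_wall - sy),  # mirrored in both walls
    }
    return {name: math.hypot(px, py) / c for name, (px, py) in images.items()}

def source_distance_from_reflections(toas, c=C):
    """Recover the source-microphone distance without using the direct path."""
    d1x = c * toas["first_x"]
    d1y = c * toas["first_y"]
    d2 = c * toas["second_xy"]
    # (2X - sx)^2 + sy^2 + sx^2 + (2Y - sy)^2 - (2X - sx)^2 - (2Y - sy)^2
    # simplifies to sx^2 + sy^2.
    return math.sqrt(d1x ** 2 + d1y ** 2 - d2 ** 2)
```

For example, a source at (1.0, 0.5) with walls at x = 3.0 and y = 2.0 gives a recovered distance equal to the direct-path distance `C * toas["direct"]`, up to floating-point rounding.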

    Performance analysis of dynamic acoustic source separation in reverberant rooms

    A quantitative assessment of group delay methods for identifying glottal closures in voiced speech
